<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Upgrading Applications</title>
</head>

<body>
<h1>Upgrading Applications</h1>

<p>Applications must frequently deal with data that lives longer than the
programs that create it. Sometimes the structure of that data changes over
time, but new versions of a program must be able to accommodate data created
by an older version. These versions may change very quickly, especially
during development of new code. Sometimes different versions of the same
program are running at the same time, sharing data across a network
connection. These situations all result in a need for a way to upgrade data
structures. </p>

<h2>Basic Persistence: Application and .tap files</h2>

<p>Simple object persistence (using <code>pickle</code> or
<code>jelly</code>) provides the fundamental <q>save the object to disk</q>
functionality at application shutdown. If you use the <code class="API"
base="twisted.application.service">Application</code> object, every object
referenced by your Application will be saved into the
<code>-shutdown.tap</code> file when the program terminates. When you use
<code>twistd</code> to launch that new .tap file, the Application object
will be restored along with all of its referenced data.</p>

<p>This provides a simple way to have data outlive any particular invocation
of your program: simply store it as an attribute of the Application. Note
that all Services are referenced by the Application, so their attributes
will be stored as well. Ports that have been bound with listenTCP (and the
like) are also remembered, and the sockets are created at startup time (when
<code>Application.run</code> is called).</p>
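<p>The underlying idea, saving an object at shutdown and reading it back at
the next startup, can be sketched with the standard <code>pickle</code>
module alone. This is only an illustration: the <code>Counter</code> class,
the <code>save</code> and <code>load</code> helpers, and the file name are
invented for the example and are not part of Twisted.</p>

<pre class="python">
import os
import pickle
import tempfile


class Counter:
    """Hypothetical piece of application state worth keeping."""

    def __init__(self):
        self.hits = 0


def save(obj, path):
    # Write the whole object graph reachable from obj to disk.
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load(path):
    # Rebuild the object graph from a previously saved file.
    with open(path, "rb") as f:
        return pickle.load(f)


path = os.path.join(tempfile.mkdtemp(), "demo-shutdown.pickle")

counter = Counter()
counter.hits = 3
save(counter, path)        # what would happen at shutdown

restored = load(path)      # what would happen at the next startup
print(restored.hits)
</pre>

<p>Only instance attributes travel through the file. The class definition
is looked up again by name when the data is loaded.</p>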

<p>To influence the way that the <code class="API"
base="twisted.application.service">Application</code> is persisted, you can adapt
it to <code class="API">twisted.persisted.sob.IPersistable</code> and use
the <code class="python">setStyle(style)</code> method with
a string like <q>pickle</q>, <q>xml</q>, or <q>source</q>. These use different serializers (and different
extensions: <q>.tap</q>, <q>.tax</q>, and <q>.tas</q> respectively) for the
saved Application.</p>

<p>You can manually cause the application to be saved by calling its
<code>.save</code> method (on the <code class="API">twisted.persisted.sob.IPersistable</code>
adapted object).</p>


<h2>Versioned: New Code Meets Old Data</h2>

<p>So suppose you're running version 1 of some application, and you want to
upgrade to version 2. You shut down the program, giving you a .tap file that
you could restore with twistd to get back to the same state that you had
before. The upgrade process is to then install the new version of the
application, and then use twistd to launch the saved .tap file. The old data
will be loaded into classes created with the new code, and now you'll have a
program running with the new behavior but the old data.</p>

<p>But what about the data structures that have changed? Since these
structures are really just pickled class instances, the real question is
what about the class definitions that have changed? Changes to class methods
are easy: nothing about them is saved in the .tap file. The difficulty arises
when the data attributes of an instance are added or removed, or when their
format changes.</p>

<p>Twisted provides a mechanism called <code class="API"
base="twisted.persisted.styles">Versioned</code> to ease these upgrades.
Each version of the data structure (i.e. each version of the class) gets a
version number. This number must change every time you add a data attribute
to the class or remove one from it. It must also change every time you modify one of
those data attributes: for example, if you use a string in one version and
an integer in another, those versions must have different version numbers.
</p>

<p>The version number is defined in a class attribute named
<code>persistenceVersion</code>. This is an integer which will be stored in
the .tap file along with the rest of the instance state. When the object is
unserialized, the saved persistenceVersion is compared against the current
class's value, and if they differ, special upgrade methods are called. These
methods are named <code>upgradeToVersionNN</code>, and there must be one for
each intermediate version. These methods are expected to manipulate the
instance's state from the previous version's format into that of the new
version.</p>

<p>To use this, simply have your class inherit from <code class="API"
base="twisted.persisted.styles">Versioned</code>. You don't have to do this
from the very beginning of time: all objects have an implicit version number
of <q>0</q> when they don't inherit from Versioned. So when you first make
an incompatible data-format change to your class, add Versioned to the
inheritance list, and add an <code>upgradeToVersion1</code> method.</p>
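<p>The mechanism can be pictured with a small stand-in written against the
standard library. This is a simplified sketch and not Twisted's
implementation: <code>MiniVersioned</code>, the <code>_savedVersion</code>
key and the <code>Note</code> class are all invented here, and the sketch
runs the upgrade steps directly inside <code>__setstate__</code>.</p>

<pre class="python">
import pickle


class MiniVersioned:
    """Toy stand-in for Versioned; NOT Twisted's implementation."""

    persistenceVersion = 0

    def __getstate__(self):
        # Store the class's current version next to the instance data.
        state = dict(self.__dict__)
        state["_savedVersion"] = self.persistenceVersion
        return state

    def __setstate__(self, state):
        # Data saved before versioning was added counts as version 0.
        saved = state.pop("_savedVersion", 0)
        self.__dict__.update(state)
        # Run every intermediate upgrade step, oldest first.
        for n in range(saved + 1, self.persistenceVersion + 1):
            getattr(self, "upgradeToVersion%d" % n)()


class Note:                          # the class as first released
    def __init__(self, text):
        self.text = text


old_bytes = pickle.dumps(Note("buy milk"))   # written by the old program


class Note(MiniVersioned):           # the class after an incompatible change
    persistenceVersion = 1

    def __init__(self, text, tags):
        self.text = text
        self.tags = tags

    def upgradeToVersion1(self):
        # Old saves have no tags at all; start them with an empty list.
        self.tags = []


note = pickle.loads(old_bytes)       # old data, new code
print(note.text, note.tags)
</pre>

<p>Because the old data carries no version marker, it is treated as version
0 and <code>upgradeToVersion1</code> runs exactly once on load.</p>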

<p>For example, suppose the first version of our class saves an integer
which measures the size of a line. We release this as version 1.0 of our
neat application:</p>

<pre class="python">
class Thing:
  def __init__(self, length):
      self.length = length
</pre>

<p>Then we fix some bugs elsewhere, and release versions 1.1 and 1.2 of the
application. Later, we decide that we should add some units to the length,
so that people can refer to it in inches or meters. Version 1.3 is shipped
with the following code:</p>

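<pre class="python">
from twisted.persisted.styles import Versioned

class Thing(Versioned):
  persistenceVersion = 1
  def __init__(self, length, units):
      self.length = "%d %s" % (length, units)
  def upgradeToVersion1(self):
      # older saves hold a bare integer, which we assume was in inches
      self.length = "%d inches" % self.length
</pre>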

<p>Note that we must make an assumption about what the previous value meant:
in this case, we assume the number was in inches.</p>

<p>1.4 and 1.5 are shipped with other changes. Then in version 1.6 we decide
that saving the two values as a string was foolish and that it would be
better to save the number and the string separately, using a tuple. We ship
1.6 with the following:</p>

<pre class="python">
from twisted.persisted.styles import Versioned

class Thing(Versioned):
  persistenceVersion = 2
  def __init__(self, length, units):
      self.length = (length, units)
  def upgradeToVersion1(self):
      self.length = "%d inches" % self.length
  def upgradeToVersion2(self):
      # split() yields strings, so convert the number back to an integer
      (length, units) = self.length.split()
      self.length = (int(length), units)
</pre>

<p>Note that we must provide both <code>upgradeToVersion1</code>
<em>and</em> <code>upgradeToVersion2</code>. We have to assume that the
saved .tap files which will be provided to this class come from a random
assortment of old versions: we must be prepared to accept anything ever
saved by a released version of our application.</p>

<p>Finally, version 2.0 adds multiple dimensions. Instead of merely
recording the length of a line, it records the size of an N-dimensional
rectangular solid. For backwards compatibility, all 1.X versions of the
program are assumed to be dealing with a 1-dimensional line. We change the
name of the attribute from <code>.length</code> to <code>.size</code> to
reflect the new meaning.</p>

<pre class="python">
from twisted.persisted.styles import Versioned

class Thing(Versioned):
  persistenceVersion = 3
  def __init__(self, dimensions):
      # dimensions is a list of tuples, each is (length, units)
      self.size = dimensions
      # one dimension is a line, two a square, and so on
      self.name = ["line", "square", "cube", "hypercube"][len(dimensions) - 1]
  def upgradeToVersion1(self):
      self.length = "%d inches" % self.length
  def upgradeToVersion2(self):
      # split() yields strings, so convert the number back to an integer
      (length, units) = self.length.split()
      self.length = (int(length), units)
  def upgradeToVersion3(self):
      self.size = [self.length]
      del self.length
      self.name = "line"
</pre>

<p>If a .tap file from the earliest version of our program were to be loaded
by the latest code, the following sequence would occur for each Thing
instance contained inside:</p>

<ol>

  <li>An instance of Thing would be created, with a <code>__dict__</code> that contained
  a single attribute <code>.length</code>, which was an integer, like
  <q>5</q>.</li>

  <li><code class="python">self.upgradeToVersion1()</code> would be called,
  changing <code>self.length</code> into a string, like <q>5 inches</q>.</li>

  <li><code class="python">self.upgradeToVersion2()</code> would be called,
  changing <code>self.length</code> into a tuple, like (5,
  <q>inches</q>).</li>

  <li>Finally, <code class="python">self.upgradeToVersion3()</code> would be
  called, creating <code>self.size</code> as a list holding a single
  dimension, like [(5, <q>inches</q>)]. The old <code>.length</code>
  attribute is deleted, and a new <code>.name</code> is created with the
  type of shape this instance represents (<q>line</q>).</li>
  
</ol>
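<p>The sequence above can be traced by hand without Twisted. The sketch
below copies only the three upgrade methods into a plain class (assuming
<code>upgradeToVersion2</code> converts the number back with
<code>int()</code>), builds an instance whose <code>__dict__</code> looks
like a 1.0 save, and records the state after each step. The
<code>history</code> list and the driving loop are illustrative only; with
the real <code>Versioned</code> class the steps are run for you.</p>

<pre class="python">
class Thing:
    """Only the upgrade methods of the 2.0 class; no Twisted required."""

    persistenceVersion = 3

    def upgradeToVersion1(self):
        self.length = "%d inches" % self.length

    def upgradeToVersion2(self):
        (length, units) = self.length.split()
        self.length = (int(length), units)

    def upgradeToVersion3(self):
        self.size = [self.length]
        del self.length
        self.name = "line"


# Recreate what loading a 1.0 save produces: an instance whose __dict__
# holds only the integer length, and on which __init__ never ran.
thing = Thing.__new__(Thing)
thing.__dict__ = {"length": 5}

history = [dict(thing.__dict__)]
for n in range(1, Thing.persistenceVersion + 1):
    getattr(thing, "upgradeToVersion%d" % n)()
    history.append(dict(thing.__dict__))

for version, state in enumerate(history):
    print(version, state)
</pre>

<p>The last recorded state holds <code>size</code> as
<code>[(5, 'inches')]</code> and <code>name</code> as <code>'line'</code>,
with no <code>length</code> attribute left.</p>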

<p>Some hints for the <code>upgradeToVersionNN</code> methods:</p>

<ul>

  <li>They must do everything the <code>__init__</code> method would have
  done, as well as any methods that might have been called during the
  lifetime of the object.</li>

  <li>If the class has (or used to have) methods which can add attributes
  that weren't created in <code>__init__</code>, then the saved object may
  have a haphazard subset of those attributes, depending upon which methods
  were called. The <code>upgradeToVersionNN</code> methods must be prepared to deal with
  this. <code>hasattr</code> and <code>.get</code> may be useful.</li>

  <li>Once you have released a class with a given
  <code>upgradeToVersionNN</code> method, you should never change that method
  (assuming you care about indefinite backwards compatibility).</li>

  <li>You must add a new <code>upgradeToVersionNN</code> method (and bump the
  persistenceVersion value) for each and every release that has a different
  set of data attributes than the previous release.</li>

  <li><code>Versioned</code> works by providing <code>__setstate__</code>
  and <code>__getstate__</code> methods. You probably don't want to override
  these methods without being very careful to call the Versioned versions at
  exactly the right time. It also requires a <code>doUpgrade</code> function
  to be called after all the objects are loaded. This is done automatically
  by <code>Application.run</code>.</li>

  <li>Depending upon how they are serialized, <code>Versioned</code> objects
  can probably be sent across a network connection, and the upgrade process
  can be made to occur upon receipt. (You'll want to look at the <code
  class="API" base="twisted.persisted.styles">requireUpgrade</code>
  function). This might be useful in providing compatibility with an older
  peer. Note, however, that <code>Versioned</code> does not let you go
  backwards in time; there is no <code>downgradeToVersionNN</code> method.
  This means it is probably only useful for compatibility in one direction:
  the newer-to-older direction must still be explicitly handled by the
  application.</li>
  
  <li>In general, backwards compatibility is handled by pretending that the
  old code was restricting itself to a narrow subset of the capabilities of
  the new code. The job of the upgrade routines is then to translate the old
  representation into a new one.</li>
  
</ul>
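<p>As an illustration of the hint about haphazard attributes, here is one
way an upgrade method might cope with old saves that may or may not carry a
given attribute. The <code>Account</code> class, its attributes and the
<code>load_old</code> helper are hypothetical; the helper only imitates
loading by filling <code>__dict__</code> directly.</p>

<pre class="python">
class Account:
    """Hypothetical class; old releases set .nickname only in rename()."""

    persistenceVersion = 1

    def upgradeToVersion1(self):
        # Old saves carry .nickname only if rename() was called before
        # the save, so check before relying on it.
        if not hasattr(self, "nickname"):
            self.nickname = self.username
        # .get on the instance dictionary supplies a default in one step.
        self.quota = self.__dict__.get("quota", 100)


def load_old(state):
    # Imitate loading: fill __dict__ directly, without running __init__.
    obj = Account.__new__(Account)
    obj.__dict__.update(state)
    obj.upgradeToVersion1()
    return obj


a = load_old({"username": "alice"})
b = load_old({"username": "bob", "nickname": "bobby", "quota": 5})
print(a.nickname, a.quota)
print(b.nickname, b.quota)
</pre>

<p>Both instances end up with the full set of attributes, whichever subset
the old program happened to save.</p>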

<p>For more information, look at the doc strings for <code class="API"
base="twisted.persisted">styles.Versioned</code>, as well as the <code
class="API" base="twisted.internet">app.Application</code> class and the <a
href="application.xhtml">Application HOWTO</a>.</p>


<h2>Rebuild: Loading New Code Without Restarting</h2>

<p><code class="API">Versioned</code> is good for handling changes between
released versions of your program, where the application state is saved on
disk during the upgrade. But while you are developing that code, you often
want to change the behavior of the running program, <em>without</em> the
slowdown of saving everything out to disk, shutting down, and restarting.
Sometimes it will be difficult or time-consuming to get back to the previous
state: the running program could include ephemeral objects (like open
sockets) which cannot be persisted.</p>

<p><code class="API">twisted.python.rebuild</code> provides a function
called <code>rebuild</code> which helps smooth this cycle. It allows objects
in a running program to be upgraded to a new version of the code without
shutting down.</p>

<p>To use it, simply call <code class="python">rebuild</code> on the module
that holds the classes you want to be upgraded. Through deep <code
class="python">gc</code> magic, all instances of classes in that module will
be located and upgraded.</p>

<p>Typically, this is done in response to a privileged command sent over a
network connection. The usual development cycle is to start the server, get
it into an interesting state, see a problem, edit the class definition, then
push the <q>rebuild yourself</q> button. That <q>button</q> could be a magic
web page which, when requested, runs <code
class="python">rebuild(mymodule)</code>, or a special IRC command, or
perhaps just a socket that listens for connections and accepts a password to
trigger the rebuild. (You want this to be a privileged operation, to prevent
someone from making your server do a rebuild while you're in the middle of
editing the code.)</p>
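<p>A password-protected trigger of this kind might look roughly like the
sketch below. The command format, the <code>handle_command</code> function
and its parameters are assumptions made for the example; in a real server
<code>do_rebuild</code> would be a callable that runs
<code>rebuild(mymodule)</code>, and the line would arrive over whatever
protocol you already use.</p>

<pre class="python">
import hmac


def handle_command(line, secret, do_rebuild):
    """Run do_rebuild() only for 'rebuild PASSWORD' with the right password."""
    parts = line.strip().split(" ", 1)
    if len(parts) != 2 or parts[0] != "rebuild":
        return "unknown command"
    # compare_digest avoids leaking the password through timing differences.
    if not hmac.compare_digest(parts[1].encode(), secret.encode()):
        return "denied"
    do_rebuild()
    return "rebuilt"


calls = []                            # stands in for the real rebuild


def record():
    calls.append("rebuild")


first = handle_command("rebuild wrong", "s3cret", record)
second = handle_command("rebuild s3cret", "s3cret", record)
print(first, second, calls)
</pre>

<p>Passing the rebuild action in as a callable keeps the password check
separate from the rebuild itself, so the check can be exercised without
touching any module.</p>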

<p>A few useful notes about the rebuild process:</p>

<ul>
  <li>If the module has a top-level attribute named
  <code>ALLOW_TWISTED_REBUILD</code>, this attribute must evaluate to True.
  Should it be false, the rebuild attempt will raise an exception.</li>

  <li>Adapters (from <code class="API">twisted.python.components</code>) use
  top-level registration function calls. These are handled correctly during
  rebuilds, and the usual duplicate registration errors are not raised.</li>

  <li>Rebuilds may be slow: every single object known to the interpreter
  must be examined to see if it is one of the classes being changed.</li>
</ul>
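<p>The search for instances can be pictured with the standard
<code>gc</code> module. This is a toy illustration, not how
<code>twisted.python.rebuild</code> is implemented: <code>old_class</code>
and <code>repoint</code> are invented names, and the <q>new</q> class is
simply defined again in the same file instead of being reloaded from a
module.</p>

<pre class="python">
import gc


class Greeter:                        # the "old" code
    def greet(self):
        return "hello"


old_class = Greeter
instances = [Greeter(), Greeter()]


class Greeter:                        # the "new" code, as if reloaded
    def greet(self):
        return "hello, world"


def repoint(old, new):
    """Point every live direct instance of old at new; return the count."""
    count = 0
    # Every object tracked by the collector is examined, which is why a
    # search of this kind can be slow in a large program.
    for obj in gc.get_objects():
        if type(obj) is old:
            obj.__class__ = new
            count += 1
    return count


changed = repoint(old_class, Greeter)
print(changed, instances[0].greet())
</pre>

<p>The existing instances keep their attributes and identity; only the
class they point at, and therefore their methods, changes.</p>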

<p>Finally, note that <code class="API"
base="twisted.python.rebuild">rebuild</code> <em>cannot</em> currently be
mixed with <code class="API"
base="twisted.persisted.styles">Versioned</code>. <code>rebuild</code> does
not run any of the classes' methods, whereas <code>Versioned</code> works by
running <code>__setstate__</code> during the load process and
<code>doUpgrade</code> afterwards. This means <code>rebuild</code> can only
be used to process upgrades that do not change the data attributes of any of
the involved classes. Any time attributes are added or removed, the program
must be shut down, persisted, and restarted, with upgradeToVersionNN methods
used to handle the attributes. (This may change in the future, but for now
the implementation is easier and more reliable with this restriction.)</p>
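<p>The restriction can be seen in a small experiment that swaps an
instance's class by hand, which is roughly the effect a rebuild has on
existing objects. The <code>Session</code> and <code>NewSession</code>
classes are hypothetical, and the direct <code>__class__</code> assignment
is a stand-in for the rebuild, not the actual <code>rebuild</code> code.</p>

<pre class="python">
class Session:                        # old code
    def __init__(self, user):
        self.user = user


live = Session("alice")               # created before the code change


class NewSession:                     # new code adds a .visits attribute
    def __init__(self, user):
        self.user = user
        self.visits = 0

    def describe(self):
        return "%s (%d visits)" % (self.user, self.visits)


# Swap the class without running any method on the instance.
live.__class__ = NewSession

try:
    result = live.describe()
except AttributeError as exc:
    result = "AttributeError: %s" % exc

print(result)
</pre>

<p>The new method is available immediately, but it fails because nothing
ever created <code>.visits</code> on the old instance. That gap is what the
<code>upgradeToVersionNN</code> methods fill when the data goes through a
save and reload.</p>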

</body> </html>
